Obsidian Metadata

url: https://gist.github.com/Jeandcc/2479b50254499bcf6ff91920f5266a2a
author: Jeandcc
description: "Scraping" site for NotebookLM.

Option 1: Use the Sitemap

  1. Visit the site’s /sitemap.xml page.
  2. Convert the XML to JSON using an online XML-to-JSON converter.
  3. Extract the URLs programmatically from the JSON structure in your browser console.
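The extraction in the last step depends on what the converter produces. The following is a minimal sketch, assuming the converted sitemap has the shape `{ urlset: { url: [ { loc: "…" } ] } }`; the function name `extractSitemapUrls` and that shape are assumptions, so adjust the property path to match your converter's output.

```javascript
// Assumed converter output: { urlset: { url: [ { loc: "https://..." }, ... ] } }.
// Other converters may nest or name these fields differently.
function extractSitemapUrls(sitemapJson) {
  const entries = sitemapJson?.urlset?.url ?? [];
  // A sitemap with a single <url> is often converted to an object, not an array.
  const list = Array.isArray(entries) ? entries : [entries];
  return list
    .map((entry) => entry?.loc)
    .filter((loc) => typeof loc === "string")
    .map((loc) => loc.trim())
    .filter((loc) => loc.length > 0);
}
```

Entries without a usable `loc` are skipped rather than treated as errors, so a partially unexpected structure still yields whatever URLs can be read.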

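Alternatively, if the site lists its pages in a sidebar, collect the link targets directly from the page:

Array.from(document.querySelectorAll('a[class*="sidebar-link_"]')).map((el) => el.href);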

Adjust the selector if needed, based on the structure of the page.
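Links collected from a page often contain duplicates, in-page anchors, or non-web schemes. As a rough sketch (the helper name `cleanUrls` is hypothetical and not part of the original gist), the list can be normalized before it is submitted:

```javascript
// Hypothetical helper: normalizes a list of hrefs before they are submitted.
function cleanUrls(hrefs) {
  const seen = new Set();
  for (const href of hrefs) {
    let parsed;
    try {
      parsed = new URL(href);
    } catch {
      continue; // skip values that are not absolute URLs
    }
    // Keep only web pages; drop mailto:, javascript:, and similar schemes.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
    parsed.hash = ""; // "#section" links point at the same page
    seen.add(parsed.href);
  }
  return [...seen]; // a Set keeps first-seen order
}
```

Whether stripping the fragment is appropriate depends on the site; single-page apps that route on the hash would need that line removed.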

Once you have the URLs, copy them with copy() in the browser console, then switch to the NotebookLM tab and define them in its console:

const urls = [
  "https://example.com/page1",
  "https://example.com/page2",
  // etc.
];


🧱 Step 2: Define the addSource Function

  1. In NotebookLM, with DevTools opened, add a dummy source (e.g., https://facebook.com) to your Notebook.
  2. Open the Network tab in DevTools and locate the corresponding fetch request.
  3. Copy the request as fetch.

It will look something like this:

fetch("https://notebooklm.google.com/…", {
  headers: { … },
  body: `…${encodeURIComponent(url)}…`,
  method: "POST",
  mode: "cors",
  credentials: "include"
});

  4. Wrap it in an addSource function:

const addSource = (url) => {
  fetch("https://notebooklm.google.com/…", {
    headers: { … },
    body: `…${encodeURIComponent(url)}…`,
    method: "POST",
    mode: "cors",
    credentials: "include"
  });
};

🔧 Replace any hardcoded URL in the request body (e.g., https%3A%2F%2Ffacebook.com) with ${encodeURIComponent(url)}, and make sure the body string is wrapped in backticks so the placeholder is interpolated.
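Instead of editing the copied body by hand, the substitution can be done in code. This is a sketch only: `buildBody` is a hypothetical helper, and it assumes the dummy URL appears in the captured body exactly as `encodeURIComponent` would encode it. If the real request escapes the URL differently (for example, encodes it twice), the lookup throws and the body has to be inspected manually.

```javascript
// Hypothetical helper: swaps the encoded dummy URL in a captured request body
// for the encoded target URL.
function buildBody(capturedBody, dummyUrl, targetUrl) {
  const encodedDummy = encodeURIComponent(dummyUrl);
  if (!capturedBody.includes(encodedDummy)) {
    throw new Error("Encoded dummy URL not found in the captured body");
  }
  // split/join replaces every occurrence without regex escaping concerns.
  return capturedBody.split(encodedDummy).join(encodeURIComponent(targetUrl));
}

// Placeholder body; a real one must be copied from the DevTools Network tab.
const exampleBody = buildBody(
  "…https%3A%2F%2Ffacebook.com…",
  "https://facebook.com",
  "https://example.com/page1"
);
```

Inside addSource, the `body` field would then be `buildBody(capturedBody, "https://facebook.com", url)`, with `capturedBody` holding the string copied from the request.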


🚀 Step 3: Submit the URLs

Once the function is defined, loop through all the URLs to submit them:

for (const url of urls) { addSource(url); }
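The plain loop fires every request at once and gives no signal when they have finished. A possible variant, sketched under the assumption that addSource is changed to `return fetch(…)` so it yields a promise, submits the URLs one at a time and reports the ones that failed. The name `submitAll` and the 500 ms pause are arbitrary choices, not part of the original gist.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumes addSource(url) returns the promise from fetch (i.e. `return fetch(...)`),
// which the addSource defined earlier does not do as written.
async function submitAll(urls, addSource, delayMs = 500) {
  const failed = [];
  for (const url of urls) {
    try {
      const response = await addSource(url);
      // Treat an explicit non-OK HTTP status as a failure.
      if (response && response.ok === false) failed.push(url);
    } catch (error) {
      failed.push(url); // network error or rejected request
    }
    await sleep(delayMs); // small pause between requests
  }
  return failed;
}
```

In the console, `submitAll(urls, addSource).then((failed) => console.log("done, failed:", failed))` logs once every response has arrived, which is a reasonable point to start waiting before a refresh.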


⚠️ Final Note

> Don’t refresh the page immediately! Wait until all HTTP requests have been received and processed by Google before refreshing NotebookLM.